
Content-based similar document image retrieval using fusion of CNN features


Abstract

The rapid increase in digitized documents has created high demand for document image retrieval. While conventional document image retrieval approaches depend on complex OCR-based text recognition and text similarity detection, this paper proposes a new content-based approach that pays more attention to feature extraction and fusion. In the proposed approach, multiple features of document images are extracted by different CNN models. The extracted CNN features are then reduced and fused into a weighted average feature. Finally, the document images are ranked by feature similarity to a provided query image. Experiments are performed on a group of document images converted from academic papers, containing both English and Chinese documents. The results show that the proposed approach retrieves document images with similar text content well, and that the fusion of CNN features can effectively improve retrieval accuracy.
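
The pipeline described in the abstract (reduce per-model features, fuse them by weighted average, rank by similarity to a query) can be illustrated with a minimal sketch. The abstract does not name the reduction method, the weights, or the similarity measure, so everything specific here is an assumption: PCA via SVD as the reduction step, equal weights, cosine similarity for ranking, and random arrays standing in for features that would in practice come from two different CNN models. The function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def pca_reduce(features, dim):
    """Project row vectors onto their top `dim` principal components (assumed reduction step)."""
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:dim].T

def l2_normalize(features, eps=1e-12):
    """Scale each row to unit length so no single model dominates the average."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / np.maximum(norms, eps)

def fuse_features(feature_sets, weights, dim):
    """Reduce each model's features to `dim` dimensions, then take a weighted average."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    reduced = [l2_normalize(pca_reduce(f, dim)) for f in feature_sets]
    fused = sum(w * r for w, r in zip(weights, reduced))
    return l2_normalize(fused)

def rank_by_similarity(fused, query_index):
    """Return the other document indices, most similar to the query first (cosine similarity)."""
    scores = fused @ fused[query_index]  # rows are unit length, so this is the cosine
    order = np.argsort(-scores)
    return [int(i) for i in order if i != query_index]

# Stand-in data: 6 document images, features from two hypothetical CNN models.
rng = np.random.default_rng(0)
features_a = rng.normal(size=(6, 32))  # e.g. a 32-d descriptor from model A
features_b = rng.normal(size=(6, 48))  # e.g. a 48-d descriptor from model B
# Make document 1 a near-duplicate of document 0 in both feature spaces.
features_a[1] = features_a[0] + 0.001 * rng.normal(size=32)
features_b[1] = features_b[0] + 0.001 * rng.normal(size=48)

fused = fuse_features([features_a, features_b], weights=[0.5, 0.5], dim=4)
ranking = rank_by_similarity(fused, query_index=0)
print(ranking)
```

In this sketch the query is one of the indexed images, because the PCA projection is fitted on the whole collection. Handling an unseen query would require storing each model's mean and projection matrix and applying them to the query's features, which is omitted here for brevity.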
